Establishing Causality as a Desideratum for Memory Models and Transformations of Parallel Programs

Authors

  • Chen Chen
  • Wenguang Chen
  • Vugranam Sreedhar
  • Rajkishore Barik
  • Vivek Sarkar
  • Guang Gao
Abstract

In this paper, we establish a notion of causality that should be used as a desideratum for memory models and code transformations of parallel programs. We introduce a Causal Acyclic Consistency (CAC) model that is weak enough to allow various useful code transformations, yet strong enough to prevent any execution that exhibits the “causal cycles” that may arise under the Java Memory Model (JMM) [18]. For memory models, we introduce a graph model, called the causality graph, that can be used to analyze whether a particular program execution violates causality. Using the causality graph, we show that a popular memory model such as the JMM can lead to program executions that exhibit causality violations with respect to our notion of causality. For code transformations, we establish criteria to identify causality-preserving transformations, that is, transformations that do not result in any execution that exhibits a causality violation. We show that the CAC model allows all causality-preserving transformations. Finally, we present preliminary experimental results for a load elimination optimization to motivate the performance benefit of using the CAC model relative to the Sequential Consistency (SC) model, which is the most basic memory model. For the benchmark program studied, using the CAC model instead of the SC model reduced the number of getfield operations performed by 37.9% and the execution time on a 16-core processor by 46.2%.
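The abstract does not define the causality graph, so the following is only a minimal sketch of the general idea of checking an execution for a causal cycle. It assumes a simplified graph whose nodes are memory events and whose edges are either data dependences within a thread or reads-from relations across threads; these edge kinds, the event labels, and the helper name `find_cycle` are illustrative assumptions, not the paper's definitions. The example execution is the well-known "out-of-thin-air" pattern (Thread 1: `r1 = x; y = r1`, Thread 2: `r2 = y; x = r2`), in which each read observes the value written by the other thread.

```python
from collections import defaultdict


def find_cycle(edges):
    """Return one directed cycle as a list of nodes, or None if acyclic.

    `edges` is an iterable of (source, target) pairs. This is a plain
    depth-first search with three-color marking; it is a generic graph
    routine, not the analysis described in the paper.
    """
    graph = defaultdict(list)
    for src, dst in edges:
        graph[src].append(dst)

    WHITE, GRAY, BLACK = 0, 1, 2   # unvisited, on current path, finished
    color = defaultdict(int)       # every node starts WHITE
    path = []                      # nodes on the current DFS path

    def visit(node):
        color[node] = GRAY
        path.append(node)
        for nxt in graph[node]:
            if color[nxt] == GRAY:
                # Back edge: the cycle is the path suffix starting at nxt.
                return path[path.index(nxt):] + [nxt]
            if color[nxt] == WHITE:
                cycle = visit(nxt)
                if cycle:
                    return cycle
        path.pop()
        color[node] = BLACK
        return None

    for node in list(graph):
        if color[node] == WHITE:
            cycle = visit(node)
            if cycle:
                return cycle
    return None


# Hypothetical event labels for the out-of-thin-air execution.
oota_edges = [
    ("T1: r1 = x", "T1: y = r1"),   # data dependence (assumed edge kind)
    ("T1: y = r1", "T2: r2 = y"),   # reads-from (assumed edge kind)
    ("T2: r2 = y", "T2: x = r2"),   # data dependence
    ("T2: x = r2", "T1: r1 = x"),   # reads-from
]

cycle = find_cycle(oota_edges)
acyclic = find_cycle(oota_edges[:-1])  # drop one reads-from edge
```

Here `cycle` contains the four events followed by a repeat of the first one, while `acyclic` is `None` because the first read no longer depends on the other thread's write. Under the stated assumptions, an execution would be flagged as a causality violation exactly when such a cycle exists.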


Similar resources

Formalizing Causality as a Desideratum for Memory Models and Transformations of Parallel Programs

It has been observed in previous work that it is desirable to avoid causal violations in any execution or transformation of a parallel program. In this paper, we formalize the notion of causality in memory consistency models and code transformations. For memory models, we introduce a causality graph framework that can be used to analyze whether a particular memory model violates causality. We sho...

An Approach to Parallelizing Fortran Programs using Rewriting Rules Technique

We present ongoing research on transforming existing sequential Fortran programs into their parallel equivalents. Our approach uses a rewriting-rules technique to automate the transformation process. Sequential source code is transformed into parallel code for shared-memory systems, such as multicore processors. Parallelizing and optimizing transformations are formall...

Transformations for the Optimistic Parallel Execution of Object-oriented Programs

This paper discusses the use of optimistic execution as a mechanism for parallelizing sequential object-oriented programs. Most parallelizing compilers to date have used compile-time data-dependency analysis to determine independent sections of code. This reliance on static information presents an overly restrictive view of dependencies in a program. In this paper, a set of transformations is p...

A Message-Passing Distributed Memory Parallel Algorithm for a Dual-Code Thin Layer, Parabolized Navier-Stokes Solver

In this study, the results of parallelization of a 3-D dual code (Thin Layer, Parabolized Navier-Stokes solver) for solving supersonic turbulent flow around body and wing-body combinations are presented. As a serial code, TLNS solver is very time consuming and takes a large part of memory due to the iterative and lengthy computations. Also for complicated geometries, an exceeding number of grid...

A High Performance Parallel IP Lookup Technique Using Distributed Memory Organization and ISCB-Tree Data Structure

The IP lookup process is a key bottleneck in routing due to the increase in routing table size, increasing traffic, and migration to IPv6 addresses. IP address lookup involves computation of the Longest Prefix Matching (LPM), for which existing solutions, such as BSD Radix Tries, scale poorly when traffic in the router increases or when employed for IPv6 address lookups. In this paper, we describe a ...

Publication date: 2010